Skip to content

Conversation

@Yogu
Copy link
Member

@Yogu Yogu commented Oct 30, 2025

  • regression tests store AQL in files
  • new regression tests with "memory pitfalls" that show us where a document is loaded into memory even if it should not
  • (work in progress) changes in AQL generation so it's simpler and fixes a few memory issues
    • avoid patterns that disable the reduce-extraction-to-projection opitimizations
    • avoid subqueries
    • generally simplify queries

@Yogu Yogu force-pushed the refactor-traversal branch 2 times, most recently from 70c7e4b to 1b60bac Compare October 30, 2025 16:32
Yogu added 11 commits October 30, 2025 17:37
Regression tests usually don't have many or big objects, so they shouldn't take a lot of memory. If they do, the query is likely not optimal.

If the value (10 MB) does not work we can also tweak it later.

This can also be used by special tests that have a large "payload" field to ensure this field never makes it into query memory.

Needed to refactor one query because it was one query that performed a lot of filters.
The original code incorrectly assumed that a log4js logger had its own level. Instead, changing a logger's level changes the level of the logger's category system-wide. That leads to very confusing behavior that depends on the order in which you create projects or database adapters.

We're still changing global state for the trace level, but in a direct and more understandable way.
This was not buggy, but it's also unnecessarily complicated in these situations.
The two ways introduce unnecessary complexity. We also will add a feature to output AQL strings for each operation, which requires operation names. Also, some of the anonymous operations were better split into multiple named operations.
There are quite a lot of files possible per test, so this is easier to scan.

Also, we will add AQL files where multiple files will be generated per test (one per operation).

Also, it's now easier to copy a test by just copying the directory.

access-restrictions/logistics-reader was accidentally deleted in eed1127, restored.
This has mainly two use cases

- Getting a general feeling of how queries are translated into AQL to discover potential for optimizations and simplifications
- Double-checking that a code change does not change AQL in a bad way, even if you don't notice it in the result with the test data
@Yogu Yogu force-pushed the refactor-traversal branch 3 times, most recently from 7e03cc3 to 47468d5 Compare October 30, 2025 16:46
@Yogu Yogu requested a review from Copilot October 30, 2025 16:48
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves query memory usage by simplifying AQL generation and storing regression test queries in files. The changes focus on optimizing query patterns to enable better database-level optimizations and avoid unnecessary memory consumption.

Key changes:

  • Regression tests now store AQL queries in separate .aql files for better tracking and comparison
  • Refactored query generation to eliminate patterns that prevent reduce-extraction-to-projection optimizations
  • Introduced new regression tests demonstrating memory-efficient query patterns

Reviewed Changes

Copilot reviewed 191 out of 1032 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
spec/regression/initialization.ts Refactored to use new InitTestDataContext class for test data initialization
spec/regression/init-test-data-context.ts New context class encapsulating GraphQL execution logic for test initialization
spec/regression/list-limits/tests//aql/.aql New AQL files for list limits regression tests
spec/regression/keywords/tests//aql/.aql New AQL files for keyword escaping tests
spec/regression/collect/tests//aql/.aql New AQL files for collection operation tests
spec/regression/access-restrictions/tests//aql/.aql New AQL files for access restriction tests
spec/regression/access-groups/tests//aql/.aql New AQL files for access group tests
spec/regression/collect/tests/*/result.json Updated result files showing work-in-progress error states
spec/regression/access-restrictions/tests/accounting/test.graphql Added query name to GraphQL test

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Yogu Yogu force-pushed the refactor-traversal branch 6 times, most recently from fe0a03a to 0dbe80e Compare October 31, 2025 15:51
@Yogu Yogu force-pushed the refactor-traversal branch 3 times, most recently from 053beb9 to db1f075 Compare November 3, 2025 16:40
Yogu added 3 commits November 4, 2025 17:15
…ields

The tests currently assert that the memory limit is exceeded. This is expected and will be fixed in an upcoming commit by a performance improvement.
This is the recommended value for Node 18: https://github.com/tsconfig/bases/blob/main/bases/node18.json

It includes findLastIndex() which is useful for us
This avoids variable names changing when features like sorting and filtering are added or removed, causing changes in .aql files of the regression tests
@Yogu Yogu force-pushed the refactor-traversal branch 6 times, most recently from 81c67b6 to 707cfd0 Compare November 6, 2025 17:20
Yogu added 3 commits November 6, 2025 18:20
The null checks prevent arangodb from applying the reduce-extraction-to-projection optimization.
… in ArangoDB through dangling-edge filter

We currently do not trust edges - if the document they point to does not exist, we ignore the edge.

In the past, this access was done via $node != null. This has the problem that ArangoDB considers it a full document access and no longer applies the reduce-extraction-to-projection optimization rule. This rule is very important to limit query memory limit usage.

Removing the dangling edges filter completely would be breaking, but we can optimize them so the reduce-extraction-to-projection optimization still works.
@Yogu Yogu force-pushed the refactor-traversal branch from 707cfd0 to 519a613 Compare November 6, 2025 17:21
Yogu added 2 commits November 7, 2025 17:28
will add some more to this commit later
Instead of collecting { root, obj } pairs in a list, we now generate the whole collect result including field selection in one subquery.

WIP: InMemoryAdapter (do this last when we're sure about the query tree API)
@Yogu Yogu force-pushed the refactor-traversal branch from 519a613 to ad8f739 Compare November 7, 2025 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants